@kaijchen kaijchen commented Dec 25, 2025

What type of PR is this?

feat

Check the PR title.

  • This PR title matches the format: <type>(optional scope): <description>
  • The description of this PR title is user-oriented and clear enough for others to understand.
  • Attach the PR updating the user documentation if the current PR requires user awareness at the usage level. User docs repo

(Optional) Translate the PR title into Chinese.

feat(milvus2): add milvus2 indexer and retriever components

(Optional) More detailed description for this PR (en: English / zh: Chinese).

en:
This PR introduces new Milvus 2.x components (indexer and retriever) under components/indexer/milvus2 and components/retriever/milvus2, using the latest milvus-io/milvus/client/v2 SDK.

Indexer Features:

  • Auto Management: Automatically handles collection schema creation, index building, and collection loading.
  • Comprehensive Index Support: Auto, HNSW, IVF (Flat, PQ, SQ8), RaBitQ (2.6+), FLAT, DiskANN, SCANN, and GPU Indexes (BruteForce, IVF_Flat, IVF_PQ, Cagra).
  • Hybrid Search Ready: Native support for Sparse Vectors (BM25/SPLADE) alongside Dense Vectors.
  • Server-side Processing: Support for Milvus Functions (e.g., auto-generating sparse vectors via BM25) and Analyzers.
  • Flexible Data Handling: Dynamic schema support and custom document-to-column conversion.
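
To illustrate what a custom document-to-column conversion might look like, here is a minimal, self-contained sketch. It does not use the component's real API: `Document`, `Columns`, and `ToColumns` are hypothetical names invented for this example, and the field layout (ID, content, dense vector, JSON metadata) is an assumption. The idea shown is only that row-shaped documents are split into per-field columns before insertion, with a dimension check on each vector.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Document is a hypothetical stand-in for the framework's document type.
type Document struct {
	ID       string
	Content  string
	Vector   []float32
	Metadata map[string]any
}

// Columns holds one slice per field (a column-oriented layout).
// The field names here are assumptions made for this sketch.
type Columns struct {
	IDs      []string
	Contents []string
	Vectors  [][]float32
	Metadata [][]byte // JSON-encoded metadata, one entry per document
}

// ToColumns splits row-shaped documents into per-field columns and
// rejects any document whose vector length differs from dim.
func ToColumns(docs []Document, dim int) (*Columns, error) {
	cols := &Columns{}
	for i, d := range docs {
		if len(d.Vector) != dim {
			return nil, fmt.Errorf("document %d (%q): vector has %d dims, want %d",
				i, d.ID, len(d.Vector), dim)
		}
		meta, err := json.Marshal(d.Metadata)
		if err != nil {
			return nil, fmt.Errorf("document %d (%q): encode metadata: %w", i, d.ID, err)
		}
		cols.IDs = append(cols.IDs, d.ID)
		cols.Contents = append(cols.Contents, d.Content)
		cols.Vectors = append(cols.Vectors, d.Vector)
		cols.Metadata = append(cols.Metadata, meta)
	}
	return cols, nil
}

func main() {
	docs := []Document{
		{ID: "a", Content: "hello", Vector: []float32{0.1, 0.2}, Metadata: map[string]any{"lang": "en"}},
		{ID: "b", Content: "world", Vector: []float32{0.3, 0.4}},
	}
	cols, err := ToColumns(docs, 2)
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println(len(cols.IDs), "rows converted")
}
```

Validating the vector dimension during conversion surfaces a mismatched embedding as a local error that names the offending document, rather than as a failure of the whole insert batch.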

Retriever Features:

  • Multiple Search Modes: Approximate, Range, Hybrid (Dense + Sparse with RRF), Iterator, and Scalar search.
  • Advanced Filtering: Support for score thresholds, range filters, and metadata filtering.
  • Result Customization: Grouping support and custom result-to-document conversion.
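
For readers unfamiliar with the RRF step in hybrid search, the sketch below shows the textbook Reciprocal Rank Fusion formula in plain Go. It is not the retriever's implementation (in a real deployment the fusion would be performed by Milvus itself); `Fused` and `FuseRRF` are hypothetical names, and `k = 60` is simply a commonly used constant, not a value taken from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// Fused is one document ID with its combined RRF score.
type Fused struct {
	ID    string
	Score float64
}

// FuseRRF merges ranked ID lists (best first) with Reciprocal Rank Fusion:
// each list contributes 1/(k + rank) to a document, where rank starts at 1.
func FuseRRF(k float64, rankings ...[]string) []Fused {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	out := make([]Fused, 0, len(scores))
	for id, s := range scores {
		out = append(out, Fused{ID: id, Score: s})
	}
	// Sort by score descending; break ties by ID so the order is deterministic.
	sort.Slice(out, func(a, b int) bool {
		if out[a].Score != out[b].Score {
			return out[a].Score > out[b].Score
		}
		return out[a].ID < out[b].ID
	})
	return out
}

func main() {
	dense := []string{"doc1", "doc2", "doc3"}  // ranking from the dense search
	sparse := []string{"doc3", "doc1", "doc4"} // ranking from the sparse search
	for _, f := range FuseRRF(60, dense, sparse) {
		fmt.Printf("%s %.5f\n", f.ID, f.Score)
	}
}
```

Because RRF uses only rank positions, it can combine dense and sparse results whose raw scores are on different scales without any normalization.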

Examples included:

  • Indexer: demo, hnsw, ivf_flat, rabitq, auto, diskann, hybrid (sparse), byov.
  • Retriever: approximate, range, hybrid (dense+sparse), iterator, scalar, grouping, filtered.

zh:
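This PR introduces Milvus 2.x components (indexer and retriever) under components/indexer/milvus2 and components/retriever/milvus2, based on the latest milvus-io/milvus/client/v2 SDK.

Indexer Features:

  • Auto Management: Automatically handles collection schema creation, index building, and loading.
  • Comprehensive Index Support: Auto, HNSW, IVF (Flat, PQ, SQ8), RaBitQ (2.6+), FLAT, DiskANN, SCANN, and GPU indexes (BruteForce, IVF_Flat, IVF_PQ, Cagra).
  • Hybrid Search Ready: Native support for sparse vectors (BM25/SPLADE) alongside dense vectors.
  • Server-side Processing: Support for Milvus Functions (e.g., auto-generating sparse vectors via BM25) and Analyzers (tokenizers).
  • Flexible Data Handling: Dynamic schema support and custom document-to-column conversion.

Retriever Features:

  • Multiple Search Modes: Approximate, Range, Hybrid (Dense + Sparse with RRF), Iterator, and Scalar search.
  • Advanced Filtering: Support for score thresholds, range filters, and metadata filtering.
  • Result Customization: Grouping support and custom result-to-document conversion.

Examples included:

  • Indexer: demo, hnsw, ivf_flat, rabitq, auto, diskann, hybrid, byov.
  • Retriever: approximate, range, hybrid, iterator, scalar, grouping, filtered.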

(Optional) Which issue(s) this PR fixes:

(Optional) The PR that updates user documentation:

@kaijchen force-pushed the milvus2 branch 4 times, most recently from 99df3b2 to 6f069fc, on December 26, 2025 08:04
@hi-pender self-requested a review on December 26, 2025 12:47
@hi-pender merged commit e665926 into cloudwego:main on Jan 15, 2026
4 checks passed